Introduction

A recent NPR article noted that the U.S. is still being flooded with opioid prescriptions written by doctors and dentists. Upon reading the article, a friend wondered what the article's heat map of prescriptions would look like overlaid with suicide rates and poverty levels. This prompted the question “Is a higher rate of legal opioid prescription related to higher suicide rates and poverty levels?”

To test the conjecture, we obtained CDC data on suicide rates, poverty levels, and opioid prescription rates, all by county. The suicide rate averages (Avg_Suicide_Rate) are suicides per 100,000 people, averaged between 2008 and 2014. Poverty levels (AvgP_Below_1.00x_FPL, AvgP_Below_1.50x_FPL, AvgP_Below_2.00x_FPL) are the percentages of families in the county below 1.00, 1.50, and 2.00 times the Federal Poverty Level (FPL), averaged between 2014 and 2018. We additionally combined the three intervals into a single Poverty_Index for each county, weighted more heavily towards the percentage of families below the FPL than towards those above it. The higher the index, the more families in the county are in or near poverty. The opioid prescription rate averages (Avg_Prescr_Rate) are retail opioid prescriptions dispensed per 100 persons, averaged between 2005 and 2015.
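The exact weights behind Poverty_Index are not stated here, so the following is only a minimal sketch of how such a weighted index could be built. The weights in `w` and the two example rows are hypothetical, chosen only so that the bracket below 1.00x the FPL counts the most.

```r
# Hypothetical county rows; the real values come from the CDC data set.
counties <- data.frame(
  AvgP_Below_1.00x_FPL = c(10, 20),
  AvgP_Below_1.50x_FPL = c(18, 32),
  AvgP_Below_2.00x_FPL = c(27, 45)
)

# Hypothetical weights (NOT the ones used in this report): they sum to 1 and
# put the most weight on families below 1.00x the FPL.
w <- c(0.5, 0.3, 0.2)

# Weighted average of the three bracket percentages, one value per county.
counties$Poverty_Index <- w[1] * counties$AvgP_Below_1.00x_FPL +
  w[2] * counties$AvgP_Below_1.50x_FPL +
  w[3] * counties$AvgP_Below_2.00x_FPL

print(counties)
```

Because the weights sum to 1, the index stays on the same percentage scale as its inputs.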

Hypotheses

Null Hypothesis: There is no correlation between suicide rates, poverty level, and opioid prescription rates. \[H_0: \quad \beta_S = \beta_P = \beta_O = 0\] Alternate Hypothesis: There is correlation between suicide rates, poverty level, and opioid prescription rates. \[H_A: \quad \beta_S , \beta_P , \beta_O \neq 0\]

In the context of our research question, a Type I error would be concluding that suicide, poverty, and opioid prescription rates are correlated when in fact they are not, for example because of an unlikely event in which all three values happened to spike similarly over the years. A Type II error would be concluding that there is no correlation between suicide, poverty, and opioid prescription rates when in fact there is one, due to…

Data Exploration

Basics

The mean opioid prescription rate per 100 persons is 93.63502, while the median is 88.14545. The mean suicide rate per 100,000 people is 15.87807, while the median is 14.97576. The mean poverty index, looking only at families below two times the FPL, is 19.21391, while the median is 18.82.


Regression models

It is only appropriate to make predictions based on a regression model if:

  1. The data show a linear trend.

  2. The distribution of residuals is approximately Normal.

  3. The residuals have constant variability.

Generate a null distribution for slopes, assuming the variables are independent. Then use infer to visualize the distribution and compute the P-value. Compare the P-value to the one given for the slope in the regression table.
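The null-distribution step can be sketched with infer roughly as follows. The `counties` data frame here is simulated (hypothetical values, not the CDC data) and only reuses the report's column names; the number of permutations is likewise an assumption.

```r
library(infer)

set.seed(42)
# Simulated stand-in for the county data (hypothetical values).
n <- 500
counties <- data.frame(Avg_Prescr_Rate = rgamma(n, shape = 5, scale = 18))
counties$Avg_Suicide_Rate <- 13 + 0.03 * counties$Avg_Prescr_Rate +
  rnorm(n, sd = 5)

# Observed slope of suicide rate on prescription rate.
obs_slope <- counties |>
  specify(Avg_Suicide_Rate ~ Avg_Prescr_Rate) |>
  calculate(stat = "slope")

# Null distribution: permuting the response breaks any association, so the
# permuted slopes show what to expect if the variables were independent.
null_dist <- counties |>
  specify(Avg_Suicide_Rate ~ Avg_Prescr_Rate) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "slope")

# Histogram of permuted slopes with the observed slope shaded in.
null_plot <- visualize(null_dist) +
  shade_p_value(obs_stat = obs_slope, direction = "two-sided")

# Two-sided P-value: share of permuted slopes at least as extreme as observed.
p_val <- get_p_value(null_dist, obs_stat = obs_slope, direction = "two-sided")
print(p_val)
```

When no permuted slope is as extreme as the observed one, infer reports a P-value of 0 with a warning; this only means the true P-value is smaller than the resolution that the number of permutations allows.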

Generate a bootstrap distribution for slopes. Then use infer and the percentile method to create a 95% confidence interval on your slope. Compare the confidence interval here to the one given in the regression table. Finally, assess whether this claim is reasonable given your data set.
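A minimal sketch of the bootstrap step, again on simulated data with the report's column names (the values and the number of resamples are assumptions):

```r
library(infer)

set.seed(42)
# Simulated stand-in for the county data (hypothetical values).
n <- 500
counties <- data.frame(Avg_Prescr_Rate = rgamma(n, shape = 5, scale = 18))
counties$Avg_Suicide_Rate <- 13 + 0.03 * counties$Avg_Prescr_Rate +
  rnorm(n, sd = 5)

# Bootstrap distribution: resample counties with replacement and refit the
# slope each time.
boot_dist <- counties |>
  specify(Avg_Suicide_Rate ~ Avg_Prescr_Rate) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "slope")

# Percentile method: the 2.5th and 97.5th percentiles of the bootstrap slopes.
boot_ci <- get_confidence_interval(boot_dist, level = 0.95,
                                   type = "percentile")
print(boot_ci)
```

The result is a one-row tibble with columns `lower_ci` and `upper_ci`, which can be set side by side with the interval from the regression table.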


Opioid prescription rate may be a statistically significant predictor of suicide rate, as the calculated P-value is less than the significance level \(\alpha = 0.05\). The linear model, however, does not seem to be of much practical use, as the adjusted \(R^2 = 0.06084\) means it explains only about 6% of the variation in suicide rate. The calculated slope is \(\beta_1 = 0.031532\) with a confidence interval of \((0.026, 0.037)\).
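For reference, the quantities quoted above (slope, its P-value, adjusted \(R^2\), and the slope's confidence interval) can be pulled from a fitted model as in the sketch below. The data are simulated and hypothetical, so the printed numbers will not match the report's.

```r
set.seed(42)
# Simulated stand-in for the county data (hypothetical values).
n <- 500
counties <- data.frame(Avg_Prescr_Rate = rgamma(n, shape = 5, scale = 18))
counties$Avg_Suicide_Rate <- 13 + 0.03 * counties$Avg_Prescr_Rate +
  rnorm(n, sd = 5)

fit <- lm(Avg_Suicide_Rate ~ Avg_Prescr_Rate, data = counties)
fit_summary <- summary(fit)

# Slope estimate and its P-value from the coefficient table.
slope   <- coef(fit)[["Avg_Prescr_Rate"]]
slope_p <- coef(fit_summary)["Avg_Prescr_Rate", "Pr(>|t|)"]

# Adjusted R^2: share of variation explained, penalized for model size.
adj_r2 <- fit_summary$adj.r.squared

# 95% confidence interval for the slope.
slope_ci <- confint(fit, "Avg_Prescr_Rate", level = 0.95)

print(c(slope = slope, p_value = slope_p, adj_r2 = adj_r2))
print(slope_ci)
```

A small P-value together with a small adjusted \(R^2\) is the pattern described above: the slope is distinguishable from zero, yet the predictor accounts for little of the variation in the response.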

This next part tells us whether it is appropriate to make predictions from the regression model, based on the guidelines above.

The real question is whether we need to make predictions from it at all, and what that would mean. We are just looking for any relationship.

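The histogram of residuals is slightly right-skewed, but still unimodal and centered close to 0, which is consistent with approximate normality. A linear trend and constant variability have to be judged from a plot of the residuals against the fitted values rather than from the histogram.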
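One way to look at all three conditions together is sketched below in base R, using simulated (hypothetical) data with the report's column names.

```r
set.seed(42)
# Simulated stand-in for the county data (hypothetical values).
n <- 500
counties <- data.frame(Avg_Prescr_Rate = rgamma(n, shape = 5, scale = 18))
counties$Avg_Suicide_Rate <- 13 + 0.03 * counties$Avg_Prescr_Rate +
  rnorm(n, sd = 5)

fit <- lm(Avg_Suicide_Rate ~ Avg_Prescr_Rate, data = counties)
res <- resid(fit)

op <- par(mfrow = c(1, 3))

# Linear trend and constant variability: residuals should scatter evenly
# around 0 with no curve and no funnel shape.
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

# Approximate normality: histogram should be unimodal and roughly symmetric.
hist(res, breaks = 30, xlab = "Residuals", main = "Histogram of residuals")

# Approximate normality: points should follow the reference line.
qqnorm(res)
qqline(res)

par(op)
```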

Opioid prescription rate may be a statistically significant predictor of the percentage of families below 1x the FPL, as the calculated P-value is less than the significance level \(\alpha = 0.05\). The linear model, however, does not seem to be of much practical use, as the adjusted \(R^2 = 0.1608\). The calculated slope is \(\beta_1 = 0.044857\) with a confidence interval of \((0.040, 0.050)\).

Opioid prescription rate may be a statistically significant predictor of the percentage of families below 1.5x the FPL, as the calculated P-value is less than the significance level \(\alpha = 0.05\). The linear model, however, does not seem to be of much practical use, as the adjusted \(R^2 = 0.159\). The calculated slope is \(\beta_1 = 0.067001\) with a confidence interval of \((0.060, 0.074)\).

Opioid prescription rate may be a statistically significant predictor of the percentage of families below 2.0x the FPL, as the calculated P-value is less than the significance level \(\alpha = 0.05\). The linear model, however, does not seem to be of much practical use, as the adjusted \(R^2 = 0.1632\). The calculated slope is \(\beta_1 = 0.086358\) with a confidence interval of \((0.077, 0.095)\).

Opioid prescription rate may be a statistically significant predictor of our calculated poverty index, as the calculated P-value is less than the significance level \(\alpha = 0.05\). The linear model, however, does not seem to be of much practical use, as the adjusted \(R^2 = 0.1652\). The calculated slope is \(\beta_1 = 0.066072\) with a confidence interval of \((0.059, 0.073)\).

If the null hypothesis were true, the observed slope would not be plausible: it falls in the extreme tail of the null distribution rather than near its center, where values are expected. The P-value is 0 when rounded to the thousandth, which is below the significance level of \(\alpha = 0.05\), so we reject the null hypothesis in favor of the alternative. The P-value here and the one generated by the linear regression above are the same when rounded to the thousandth.

Quick Summary


Variable             Mean      Median    Units
Opioid Prescription  93.63502  88.14545  Prescriptions per 100 persons
Suicide              15.87807  14.97576  Suicides per 100,000 people
Poverty Index        19.21391  18.82     N/A

OPR = opioid prescription rate, SR = suicide rate, P1.0/P1.5/P2.0 = percentage of families below 1.0x/1.5x/2.0x the FPL, PI = poverty index.

Model       Null Distribution P-Value  LM P-Value  Adjusted \(R^2\)  Slope     LM Slope CI     Bootstrap CI
OPR x SR    0                          < 2.2e-16   0.06084           0.031532  (0.026, 0.037)  (0.02573316, 0.03740757)
OPR x P1.0  0                          < 2.2e-16   0.1608            0.044857  (0.040, 0.050)  (0.03914246, 0.05022237)
OPR x P1.5  0                          < 2.2e-16   0.159             0.067001  (0.060, 0.074)  (0.05913409, 0.07493694)
OPR x P2.0  0                          < 2.2e-16   0.1632            0.086358  (0.077, 0.095)  (0.07714104, 0.09526279)
OPR x PI    0                          < 2.2e-16   0.1652            0.066072  (0.059, 0.073)  (0.05869566, 0.07321176)

Let’s Put Stuff Together

Now consider how the association between suicide rate and opioid prescription rate would change if poverty brackets were taken into account. Create a plot to illustrate the relationship between all three variables (recall the use of the color argument in aes()). Just by looking at the plot and without fitting any models, does it appear that the relationship between the suicide rate and opioid prescription rate is the same for higher poverty counties as it is for lower poverty counties?
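Such a plot could be sketched with ggplot2 as follows. The data are simulated (hypothetical), and splitting Poverty_Index into three equal-sized brackets called `Poverty_Bracket` is an assumption made for illustration, not the report's definition of a poverty bracket.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the county data (hypothetical values).
n <- 300
counties <- data.frame(
  Avg_Prescr_Rate = rgamma(n, shape = 5, scale = 18),
  Poverty_Index   = runif(n, min = 5, max = 40)
)
counties$Avg_Suicide_Rate <- 13 + 0.03 * counties$Avg_Prescr_Rate +
  rnorm(n, sd = 5)

# Hypothetical brackets: cut the index at its tertiles.
counties$Poverty_Bracket <- cut(
  counties$Poverty_Index,
  breaks = quantile(counties$Poverty_Index, probs = c(0, 1/3, 2/3, 1)),
  labels = c("low", "middle", "high"),
  include.lowest = TRUE
)

# Mapping color to the bracket draws one point cloud and one fitted line
# per bracket, so their slopes can be compared by eye.
p <- ggplot(counties, aes(x = Avg_Prescr_Rate, y = Avg_Suicide_Rate,
                          color = Poverty_Bracket)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Opioid prescriptions per 100 persons",
       y = "Suicides per 100,000 people",
       color = "Poverty bracket")

print(p)
```

Roughly parallel lines would suggest the same relationship in every bracket, while lines with clearly different slopes would suggest that the relationship depends on poverty level.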


## # A tibble: 1 x 1
##   p_value
##     <dbl>
## 1       0

## # A tibble: 1 x 2
##   lower_ci upper_ci
##      <dbl>    <dbl>
## 1    0.147    0.277